Clustering What Matters: Optimal Approximation for Clustering with Outliers
Authors
Abstract
Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set X of n points and two numbers k and m, clustering with outliers aims to exclude m points from X and partition the remaining points into k clusters so as to minimize a certain cost function. In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in k and m (i.e., an algorithm with running time of the form f(k, m) * poly(n) for some function f) that almost matches the approximation ratio of its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for k-Median and k-Means with outliers in Euclidean metrics. We also exhibit more applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.
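The objective described in the abstract can be illustrated with a small sketch. This is not the paper's FPT algorithm; it only evaluates the cost of a given set of centers once up to m points are excluded. The function name `cost_with_outliers`, the `power` parameter, and the sample data are assumptions made for illustration.

```python
import math


def cost_with_outliers(points, centers, m, power=1):
    """Cost of a fixed set of centers when the m costliest points are excluded.

    Hypothetical helper for illustration only:
    power=1 corresponds to the k-Median objective, power=2 to k-Means.
    """
    # Contribution of each point: distance to its nearest center, raised to `power`.
    per_point = [
        min(math.dist(p, c) for c in centers) ** power
        for p in points
    ]
    # For fixed centers, dropping the m largest contributions minimizes the cost.
    per_point.sort()
    kept = per_point[: max(len(per_point) - m, 0)]
    return sum(kept)


# Illustrative data: two tight groups plus one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (100.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]

print(cost_with_outliers(points, centers, m=0))  # far point is charged
print(cost_with_outliers(points, centers, m=1))  # far point is excluded
```

Evaluating the objective is the easy part; the hardness lies in choosing the k centers and the m excluded points jointly, which is the problem the paper's approach addresses.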
Similar resources
Approximation Schemes for Clustering with Outliers
Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, k-median, and k-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons. For instance, some points may arise from noise in a data set or one might b...
Approximation Algorithms for Clustering Problems with Lower Bounds and Outliers
We consider clustering problems with non-uniform lower bounds and outliers, and obtain the first approximation guarantees for these problems. We have a set F of facilities with lower bounds {L_i}_{i∈F} and a set D of clients located in a common metric space {c(i, j)}_{i,j∈F∪D}, and bounds k, m. A feasible solution is a pair (S ⊆ F, σ : D ↦ S ∪ {out}), where σ specifies the client assignments, such ...
Sublinear Projective Clustering with Outliers
Given a set of n points in ℝ^d, a family of shapes S and a number of clusters k, the projective clustering problem is to find a collection of k shapes in S such that the maximum distance from a point to its nearest shape is minimized. Some special cases of the problem include the k-line center problem where the goal is to cover the points with minimum radius hypercylinders and the k-hyperplane c...
Bayesian Clustering with Outliers and Missing Values
The Bayesian Robust Mixture Model (BRMM) is a fully probabilistic model for grouping real-valued data into a finite number of clusters. The model is robust in the sense that it tolerates outliers in the data and handles missing values, both within the Bayesian inference framework. Foreword: The purpose of this report is to provide a detailed, step-by-step derivation of the variational update equa...
Approximation Algorithms for Bregman Clustering Co-clustering and Tensor Clustering
The Euclidean k-means problem is fundamental to clustering and over the years it has been intensely investigated. More recently, generalizations such as Bregman k-means [8], co-clustering [10], and tensor (multi-way) clustering [40] have also gained prominence. A well-known computational difficulty encountered by these clustering problems is the NP-hardness of the associated optimization task, ...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i6.25818